Toward a Short Text Classification Framework Based on Background Knowledge Discovery

Author

  • Isak Taksa
Abstract

The ubiquitous, diverse, and growing impact of digital living creates a massive amount of short text: a search query, a tweet, or a caption. Short text frequently presents itself as an arbitrary combination of semantically unconnected words. Using machine learning to classify corpora of such texts is a challenging task. A large body of research exists in this area, but in this paper we focus on Background Knowledge (BK) and its role in machine learning for short-text and non-topical classification. More specifically, we present an effort to create a short text classification framework based on Background Knowledge. We propose novel Information Retrieval techniques to construct BK and demonstrate the advantages of Automatic Query Expansion (AQE) over basic search. We discuss other results of this research and its implications for the advancement of short text classification.
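To make the contrast between basic search and Automatic Query Expansion concrete, here is a minimal sketch. It is not the author's method: it assumes a tiny in-memory corpus in place of a real search index, and it uses pseudo-relevance feedback (adding frequent terms from the top-ranked documents) as one common form of AQE. The names `CORPUS`, `search`, and `expand_query` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy corpus (an assumption; stands in for whatever index supplies BK).
CORPUS = [
    "homework help for middle school math students",
    "math homework answers and school study guides",
    "retirement planning and pension advice for seniors",
    "cheap flights and hotel deals",
]

STOPWORDS = {"for", "and", "the", "a", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def search(query_terms, corpus):
    """Basic search: rank documents by how often the query terms occur."""
    scored = []
    for doc in corpus:
        tokens = tokenize(doc)
        score = sum(tokens.count(t) for t in query_terms)
        if score > 0:
            scored.append((score, doc))
    # Stable sort: ties keep their original corpus order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]


def expand_query(query, corpus, top_docs=2, extra_terms=2):
    """Pseudo-relevance feedback: append frequent terms from the top hits."""
    terms = tokenize(query)
    hits = search(terms, corpus)[:top_docs]
    counts = Counter(
        t for doc in hits for t in tokenize(doc) if t not in terms
    )
    # Highest count first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [t for t, _ in ranked[:extra_terms]]


if __name__ == "__main__":
    expanded = expand_query("math homework", CORPUS)
    print("expanded query:", expanded)
    print("documents retrieved as BK:", search(expanded, CORPUS))
```

In this sketch the documents retrieved by the expanded query would serve as the background knowledge attached to a short text. A real pipeline would query an actual search engine and weight the candidate expansion terms more carefully.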


Similar resources

Temporal and Contextual Evaluation of Background Knowledge Discovery for Short Text Classification

Background Knowledge (BK) plays an essential role in machine learning for short-text and non-topical classification. In this paper the authors present and evaluate two Information Retrieval techniques used to assemble four sets of BK in the past seven years. These sets were applied to classify a commercial corpus of search queries by the apparent age of the user. Temporal and contextual evaluat...


Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services

The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer ...

Indeterminacy, Discovery and Polyphony in Houshang Golshiri's Short Stories

Houshang Golshiri is among the leading creative and imaginative Iranian fiction writers who managed to open up new horizons in Iranian fiction. Hence he could be claimed to be an innovative avant-garde short story writer with unique stylistic characteristics. Although inspired by fiction writers such as Alavi, Sadeqi, Golestan and Sa'edi in the techniques of narration, Golshiri nonetheless stan...


The Impact of Contextual Clue Selection on Inference

Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...



Publication date: 2015